27112 relay cell processing abstraction#8
Closed
dmr-x wants to merge 29 commits intotorproject:masterfrom
Closed
Conversation
It's intended to defined common Relay cell functionality, for encrypted and decrypted cells. By unpacking content no further than the payload, this abstraction should allow a lot of flexibility in handling Relay cells prior to decryption. Note that unpacking is not yet possible - subclasses must be defined with a VALUE. (Namely: RELAY and RELAY_EARLY)
stem.client presently makes typical use of non-negative inputs with unsigned types.
Since RELAY cells come across the wire encrypted, they cannot be directly unpacked - their payload must first be decrypted. Note that this breaks unpacking RelayCell in the interim.
Notably, this along with CANNOT_DIRECTLY_UNPACK for RelayCell (prior change) allows Cell.pop() and Cell.unpack() to be executed directly on received bytes from a connection, and still process encrypted cells correctly into RELAY cells, to be decrypted later. RelayCell tests marked with TODO where temp-fix was applied. RelayCell implementation will gradually be improved/refactored into better supporting subclasses per command.
The line is long, so this change also extends the vertically aligned table format for other Cell classes, in a forward-looking manner. Arguably this could've been part of the previous change, but it made sense to break it out for readability purposes.
This is a proof-of-concept change to show the benefits of migrating finer details into the Cell abstraction. It still arguably violates the Cell abstraction by using RelayCell._unpack(), but that can be addressed in a later change.
s/:parm /:param / s/:raise:/:raises:/ s/:return:/:returns:/
This allows the logic to be reused. It may ultimately be better represented as a @Property, but for now this should do fine.
Eventually, having this separated out will help during decryption of cells in multi-hop circuits - each layer of decryption must attempt to interpret the payload in order to ascertain whether it has been fully decrypted.
It's important to decouple packing the payload from packing the whole cell. This is probably not the final form this method will take, but in the interim, it should help reduce the magnitude of violating the Cell abstraction layer, for any payload encryption/decryption.
This is more appropriate since the spec allows for (and recommends!) different behavior for these bytes than other padding bytes. However, presently the handling is identical. It is also important that the _pack_payload() method be able to produce a full-size payload, for simplifying encryption.
This is a continuation of proof-of-concept changes. Again, this is not the final form that encryption will take on.
There may be a bit too much going on here - i.e. these methods may not be anything but a bit redundant - but things will get trimmed down later when the API is better worked out.
This method - currently unused - allows moving some of the digest-application logic into the Cell abstraction layer. It is a bit weird for it to have side-effects on a parameter (digest), but it's documented and works fairly logically.
The abstraction layers are starting to look better separated.
Per discussion with Damian over IRC, instances of our Cell classes are intended to be immutable objects. So instead of modifying the RelayCell and having an additional side effect on the digest, this provides a modified copy of each, in a tuple. That makes apply_digest() a pure function, currently with referential transparency (i.e. also deterministic). In the future, it may utilize randomness for cell padding, which would make it non-deterministic. Overall this now more generally follows a functional-programming paradigm.
e02cf0e to
f3f44a2
Compare
Contributor
Author
|
branch rebased and force-pushed |
Much of the updates here is to reduce the line length.
This further tunes the abstraction layers such that consumers don't need to care at all what part of the Cell is encrypted. They are still responsible for managing the digest/encryption key states, but don't need to know the finer details of the encryption.
Similar to how apply_digest() is used by a RELAY cell sender, this collects the logic needed for a receiver to confirm that a cell is fully decrypted and that the cell's integrity is intact. This is new functionality. Prior to this change, stem.client had no facility to check the digest of a RELAY cell according to the spec. Not yet used.
This is new functionality. (Nothing in stem.client currently attempts to check the 'recognized' field.) Not yet used. This is arguably a very inefficient method, and it might even obscure its purpose by existing (it's extremely simple), but it seemed better to break it out and deal with the foreseeable performance hit than to trend towards premature optimization. (It also is a bit counterintuitive for a value of 0 to mean True, so breaking this out makes that clearer.) There are many ways this could be optimized, and - especially since so much of this unit is still in flux - it doesn't make sense yet to do so.
This method facilitates separating decryption / recognition of a cell from the interpretation of a cell. Many places in the spec indicate that cells should be dropped (ignored) under certain conditions, but that such cells should still be accounted for within the running decryption / digest. Furthermore, it may be useful in some cases to convert between raw cells and interpreted cells, without including the other steps of encryption / decryption. This additionally allows removing one of the interim TODOs in the RELAY cell unit tests. Arguably the tests need some rework with all these changes, but that can come a bit later. Not yet used, beyond the aforementioned unit tests. This method will probably evolve a decent amount, since the unit is still in a state of high flux.
Akin to encrypt(), this takes care of decryption and all the ancillary
functionality related to it.
It composes the functionality of the building blocks added in previous
changes, namely:
* check_recognized_field()
* check_digest()
* interpret_cell() - but not automatically done by default
A consumer of this, especially in a multi-hop circuit, will only need to
manage the digest/decryptor state (in the case of a cell that cannot be
decrypted+recognized... fallback) and pay attention to the
'fully_decrypted' result.
This should make a consumer's implementation pretty readable, something
like:
# relay_1 := nearest relay in circuit
decryption_states = [
(relay_1, digest_1, decryptor_1),
(relay_2, digest_2, decryptor_2),
(relay_3, digest_3, decryptor_3),
]
new_decryption_states = copy.deepcopy(decryption_states)
from_relay = None
for i in range(len(new_decryption_states)):
relay, digest, decryptor = new_decryption_states[i]
cell, fully_decrypted, digest, decryptor = cell.decrypt(digest, decryptor)
new_decryption_states[i] = (relay, digest, decryptor)
if fully_decrypted:
from_relay = relay
break
if from_relay:
decryption_states = new_decryption_states
else:
# handle this, since the cell couldn't be decrypted
# probably raise Something()
pass
cell_and_hop = (cell, from_relay) # naming things is hard
# do something with the decrypted cell
# ^ probably feeding it into the queue/handler for the (circuit_id, relay, stream_id)
This also now actually checks the 'recognized' field and digest, unlike before. Keep in mind that this implementation is really meant for illustrative purposes, in the interim. (There are some limitations to it.)
And rename the other 'orig_' vars to 'orig_forward_' vars, for clarity.
This arrangement is better, as it better keeps track of the digest/key on a cell-by-cell basis, but it's still not spec-compliant in a number of ways, such as its potential to drop cells or get corrupted backward digest/key state in the case of an exception. That's ok - this is just an interim implementation, and this change is just to make things a bit better for illustrative purposes.
…ecks, and tune abstraction layers
Work on stem.client, #27112
There are a few main improvements herein:
* creating an intermediate form for RELAY cells that does not yet
unpack the payload, since it would be encrypted on the wire
* moving the finer details of encryption/decryption of cell content
into the Cell abstraction layer (e.g. things like payload size)
* adding simple encrypt()/decrypt() methods
* checking the 'recognized' field and digest of decrypted cells
* refactoring some functionality into separate methods
* laying the groundwork for:
- subclasses for RELAY commands
- RELAY and RELAY_EARLY handling
- randomized padding in RELAY cells
- multi-hop circuits
Beyond that, this:
* slightly improves the Circuit send/receive of encrypted/decrypted
cells
* improves a few docstrings
* introduces a friendlier exception message for Size packing
https://trac.torproject.org/projects/tor/ticket/27112
f3f44a2 to
e8a7599
Compare
Contributor
Author
|
branch rebased and force-pushed (a few days ago) |
Contributor
|
Cleaning up old pull requests (dmr's work was merged a long while back). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Trac ticket